Abstract: Long terminal repeat retrotransposons (LTR-RTs), the major genomic components in plants, can be classified 
into autonomous and nonautonomous elements based on their internal structures and retrotranspositional properties. Large 
numbers of nonautonomous elements have been identified, but the factors and mechanisms that govern their 
retrotransposition are poorly understood. Here we summarize recent advances in the study of LTR-RTs in plants and 
discuss how nonautonomous LTR-RTs were generated, proliferated, and evolved in their host genomes, with an emphasis on 
the partnership and interaction between nonautonomous elements and their autonomous partners. This 
review provides insights into the evolution of nonautonomous LTR-RTs and should facilitate a fuller understanding of the 
retrotranspositional process of LTR-RTs in plants. 
I. Introduction 

Retrotransposons are a class of mobile elements that transpose through a copy-and-paste mechanism 
via RNA intermediates. They can be divided into five orders on the basis of their structural features: long 
terminal repeat retrotransposons (LTR-RTs), DIRS-like elements, Penelope-like elements (PLEs), LINEs, and SINEs [1]. 
Among them, LTR-RTs are the major genomic components in flowering plants, particularly in species with complex 
genomes. For example, ~20% of the rice [2], ~42% of the soybean [3], ~55% of the sorghum [4], and >75% of the maize [5] 
genomes are composed of LTR-RTs. Recent studies indicate that in diploid species, genome size and transposable element 
(TE) content show a strong positive correlation [6]. 

A typical LTR-RT contains two identical LTRs, a primer-binding site (PBS), a polypurine tract (PPT), and gag and pol, 
the two genes necessary for retrotransposition [7]. The LTR can be further divided into three regions: 
U3, R, and U5 [7]. Because the two LTRs of an element are identical at the time of insertion, the insertion time can be 
estimated from the sequence divergence between the two LTRs and the evolutionary rate of LTR sequences [8]. For example, 
the majority of LTR-RTs in soybean were amplified in the last 1 million years (Mys) [9]. LTR-RTs are usually subclassified 
into the Copia and Gypsy superfamilies based on the order of the integrase (IN) and reverse transcriptase (RT) domains in 
pol [10]. Occasionally, elements are found to contain an additional ORF1 gene upstream of gag and/or an envelope 
(env)-like gene downstream of pol. Because neither gene is required for retrotransposition, their origin and functional 
roles remain unclear. Recent genome-wide analyses and multi-species comparisons revealed that these elements are ancient 
and lineage-specific [1, 9]. Besides intact LTR-RTs, a large number of solo-LTRs and truncated elements have also been 
found in plant genomes [9, 11, 12]. These incomplete elements, together with numerous LTR remnants, are presumed to be 
the products of unequal recombination and illegitimate recombination, two molecular mechanisms that counterbalance 
genome expansion [11, 12]. For instance, it was estimated that >190 Mb of DNA had been removed from the rice genome in 
the past 8 Mys, leaving the current rice genome at ~400 Mb, with ~97 Mb of detectable LTR-RT DNA [12]. 
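
The LTR-based dating described above is commonly written as T = K / (2r), where K is the sequence divergence between the two LTRs of one element and r is the substitution rate per site per year. The following is a minimal sketch of that calculation, not the procedure of any cited study. It assumes the two LTRs are already aligned to equal length, applies a Jukes-Cantor correction (one common choice among several), and uses a default rate of 1.3e-8 substitutions per site per year, a value often applied to grass LTR-RTs that is not stated in this review. The function names are illustrative.

```python
import math


def jukes_cantor_distance(ltr5: str, ltr3: str) -> float:
    """Jukes-Cantor corrected divergence between two pre-aligned LTRs.

    Positions containing gaps or ambiguous bases are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # raw mismatch fraction
    if p >= 0.75:
        raise ValueError("sequences too divergent for Jukes-Cantor")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_time_years(divergence: float, rate: float = 1.3e-8) -> float:
    """T = K / (2r): both LTRs accumulate substitutions independently.

    The default rate is an assumption, not a value from the text.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return divergence / (2.0 * rate)


if __name__ == "__main__":
    five_prime = "A" * 100
    three_prime = "A" * 97 + "C" * 3  # toy alignment with 3 mismatches
    k = jukes_cantor_distance(five_prime, three_prime)
    print(f"K = {k:.4f}, T = {insertion_time_years(k):.0f} years")
```

Under the assumed rate, a divergence of 0.026 corresponds to about 1 Mys, and an element with identical LTRs is dated to 0, which is why identical LTRs are read as evidence of very recent insertion.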

Based on their structural completeness and retrotranspositional capability, LTR-RTs can also be classified into autonomous 
and nonautonomous types. An intact element is defined as autonomous if it encodes all the protein-coding domains necessary 
for catalyzing its own retrotransposition [1]. By contrast, an element that lacks one or more protein-coding domains but 
retains retrotranspositional activity over some period of time is generally defined as nonautonomous. Large retrotransposon 
derivatives (LARDs) and terminal-repeat retrotransposons in miniature (TRIM) are two groups of nonautonomous LTR-RTs 
[13, 14]. Since neither LARDs nor TRIM have open reading frames in the internal region, they are 
presumed to transpose by borrowing proteins from their autonomous partners. In most cases, however, the relationships 
between autonomous and nonautonomous elements have not yet been established, and the exact mechanism(s) governing the 
activity of nonautonomous elements remain unclear. Although the transpositional mechanisms of Ds1 in maize and MITEs in 
rice have been discussed previously [15-19], these nonautonomous elements transpose via a "cut-and-paste" mechanism and do 
not undergo reverse transcription. Their transposition mechanism may therefore be fundamentally different 
from that of LTR-RTs. 

We previously identified 510 LTR-RT families in the sequenced soybean genome and conducted a further 
comprehensive analysis of the largest family, SNARE [20, 21]. This family contains both autonomous and nonautonomous 
subfamilies. We found that nonautonomous elements frequently exchanged LTR domains with their autonomous partners 
at different times in soybean evolutionary history, providing evidence that autonomous and nonautonomous 
LTR-RTs can interact and communicate with each other. Here we review recent studies on plant nonautonomous 
elements with respect to their nature, timing, origin, and evolutionary process, thereby providing insights into the 
retrotranspositional process of LTR-RTs. 

II. Structural features of nonautonomous LTR-RTs 

Nonautonomous LTR-RTs are widespread throughout eukaryotic lineages, particularly in flowering plants [22-24]. They fall 
into two types based on the absence or presence of coding genes in the internal region. The first type includes 
both LARDs and TRIM, neither of which contains any signature of retrotransposition-related genes such as gag and pol 
[13, 14]. In LARD elements, the coding region is replaced by a long, conserved noncoding domain. In TRIM elements, by 
contrast, the internal region is almost completely absent and the LTRs are quite short, making the intact element very 
small (<1 kb [13]). The second type includes nonautonomous elements such as noaCRR1/noaCRR2 [25], Retand-1 [26], BARE-2 
[27, 28], and SNRE [21]. Elements of this type have detectable gag and/or pol, but the coding regions have been either 
highly degraded or disrupted by frameshifts and stop codons, indicating that these elements are defective and require 
products from other elements in trans to amplify in their host genome. The majority, if not all, of the nonautonomous 
elements investigated thus far retain two intact LTR sequences, in which the transcription initiation and termination 
sites reside, as well as a PBS and a PPT. These features may reflect the minimum information required for a nonautonomous 
element to move in a genome. Tandem repeats (usually 24-100 bp per monomer) are frequently observed in the internal 
region, particularly between pol and the 3' LTR [21, 26, 29]. Despite their conserved location, little is known about 
the origin and functional role of these tandem repeats. 
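
The structural distinctions drawn in this section can be summarized as a simple decision rule. The sketch below is purely illustrative: the `LtrElement` record, its field names, and the `classify` function are hypothetical, and the only threshold taken from the text is the <1 kb size of TRIM elements. It classifies on structure alone, whereas calling an element nonautonomous in practice also requires evidence that it has been retrotranspositionally active.

```python
from dataclasses import dataclass


@dataclass
class LtrElement:
    """Hypothetical annotation record for one LTR-RT copy."""
    length_bp: int
    has_two_intact_ltrs: bool
    has_pbs: bool
    has_ppt: bool
    gag_detectable: bool
    pol_detectable: bool
    coding_intact: bool  # gag and pol free of frameshifts and premature stops


def classify(e: LtrElement) -> str:
    """Assign a structural category following the two types described above."""
    # Two intact LTRs, a PBS, and a PPT are treated as the minimum requirement.
    if not (e.has_two_intact_ltrs and e.has_pbs and e.has_ppt):
        return "incomplete"
    if e.gag_detectable and e.pol_detectable and e.coding_intact:
        return "autonomous"
    # Second type: coding regions detectable but degraded or disrupted.
    if e.gag_detectable or e.pol_detectable:
        return "nonautonomous (degraded coding region)"
    # First type: no coding signature; TRIM elements are very small (<1 kb).
    if e.length_bp < 1000:
        return "nonautonomous (TRIM-like)"
    return "nonautonomous (LARD-like)"


if __name__ == "__main__":
    toy = LtrElement(800, True, True, True, False, False, False)
    print(classify(toy))
```

The order of the checks matters: elements with any detectable gag or pol are assigned to the second type before size is considered, so a short element with a degraded coding region is not mislabeled as TRIM-like.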


III. The abundance, timing and origin of nonautonomous elements 

Although nonautonomous elements have been frequently identified, their abundance, timing, nature, and origin are not well 
understood. Previous studies revealed that some nonautonomous families, such as Dasheng in rice, have a few 
hundred copies, most of which have identical LTRs, indicating that they were amplified quite recently [29]. In contrast, the 
maize family Zeon-1 is one of the oldest families in its host genome [30]. Interestingly, although the nonautonomous 
subfamily SNRE in soybean underwent an amplification burst ~2 Mys ago, a subset of these elements carrying a 
piggy-backing solo-LTR was dramatically amplified within the last 0.5 Mys [9]. Taken together, these observations indicate 
that the timing and extent of amplification of nonautonomous elements vary across species and families and are affected by 
the host genome, its evolutionary history, and natural selection on the genes involved in retrotransposition [31]. 

The origin of nonautonomous elements remains unclear. Sequence comparisons of centromeric retrotransposons revealed that 
noaCRR and CRR elements in rice share substantial sequence similarity in the LTRs and in conserved motifs, including the 
two terminal ends of the LTR sequences and the PBS and PPT sites, indicating that noaCRR elements were likely derived from 
CRR [25]. In the soybean SNARE family, elements of the nonautonomous subfamily SNRE and the autonomous subfamily SARE_A 
share the same type of tandem repeats, and the majority of SNRE and SARE_A elements cluster phylogenetically in a 
monophyletic group [21]. It is therefore reasonable to deduce that SNRE elements were derived from SARE_A rather than 
SARE_B [21]. 

IV. Retrotransposition and amplification process of nonautonomous elements 

Because nonautonomous elements do not have a full set of genes encoding the proteins necessary for retrotransposition, they 
are presumed to use the same or very similar enzymatic machinery as their autonomous partners. Using a homology-based 
approach, a putative autonomous and nonautonomous partnership, RIRE2 and Dasheng, was identified in the rice 
genome [29]. However, RIRE2 and Dasheng elements grouped into two distinct clades based on their LTR sequences, 
suggesting that few recombination events have occurred between the RIRE2 and Dasheng families. In the rice genome, four 
other putative partnerships have been proposed: noaCRR1/CRR1, noaCRR2/CRR2, Spip/RIRE3, and Squiq/RIRE8 (Table 1, [25, 
32]). Similarly, in the S. latifolia genome, Retand-1 (nonautonomous) elements were found to be about as abundant as 
Retand-2 elements (autonomous) and were widely transcribed in all tissues tested [26]. Like RIRE2 and Dasheng, the above 
partners belong to the Gypsy-like superfamily, and the members of each pair were essentially separated into two distinct 
families. A Copia-like nonautonomous family, BARE-2, and its putative autonomous partners BARE-1/Wis-2 have also been 
identified in H. vulgare (Table 1, [27]). More interestingly, BARE-2 appears to be a chimeric element: its two LTRs and 
5' UTR region are more similar to BARE-1, whereas the rest of the element is more similar to another family, Wis-2. This 
chimeric structure was likely generated by retrotransposon recombination through strand switching during replication [27]. 

Table 1. Summary of the nonautonomous families and their putative autonomous partners

Species            Superfamily    Nonautonomous    Predicted autonomous    Reference
Oryza sativa       Gypsy-like     noaCRR1          CRR1                    [25]
Oryza sativa       Gypsy-like     noaCRR2          CRR2                    [25]
Oryza sativa       Gypsy-like     Dasheng          RIRE2                   [29]
Oryza sativa       Gypsy-like     Spip             RIRE3                   [32]
Oryza sativa       Gypsy-like     Squiq            RIRE8                   [32]
Silene latifolia   Gypsy-like     Retand-1         Retand-2                [26]
Hordeum vulgare    Copia-like     BARE-2           BARE-1/Wis-2            [27]
Glycine max        Gypsy-like     SNRE             SARE                    [21]


Further evidence for RNA-level recombination comes from a recent study of the SNARE family in soybean (Table 1, [21]). 
The SNARE family contains two autonomous subfamilies, SARE_A and SARE_B, and a nonautonomous subfamily, SNRE. 
Unexpectedly, a subset of SNRE elements (called SNREf) carry a foreign solo-LTR at the same site in the internal region. 
Elements of the nonautonomous subfamily SNRE and the autonomous subfamily SARE share highly similar LTR sequences, 
identical PBS and PPT sites, conserved tandem repeats, similar distribution patterns, and similar preferential integration 
sites (an overall bias for G or C). Furthermore, phylogenetic analysis and case-by-case structural examination of the 
recombinants and their parental elements revealed that extensive region-specific sequences have been swapped within recent 
evolutionary timeframes. The majority of the recombinants are difficult to explain by genomic recombination; 
the recombinant LTR structures are more consistent with the RNA recombination model. If the genomic recombination model 
held true, the new copies would contain chimeric LTR sequences after amplification. Instead, the data showed that the 
whole LTR regions shared high similarity with those of SARE, whereas the coding regions were more similar to SNRE [21]. 
In summary, these observations indicate that RNA-level recombination, rather than genomic recombination, mediates SNARE 
evolution, and that a molecular mechanism may be involved in enhancing the partnership between autonomous and 
nonautonomous elements [21]. 
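
The contrast between the two recombination models comes down to whether a recombinant's LTR is chimeric or resembles a single parent along its whole length. The toy check below illustrates that idea only; it is not the analysis performed in [21]. It assumes the query LTR and both parental LTRs have already been aligned to equal length, and the function names and window size are hypothetical choices.

```python
from typing import List


def identity(x: str, y: str) -> float:
    """Fraction of identical positions between two equal-length aligned strings."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


def closest_parent_by_window(
    query: str, parent_a: str, parent_b: str, window: int = 50
) -> List[str]:
    """Label each window of the query LTR with its more similar parent."""
    if not (len(query) == len(parent_a) == len(parent_b)):
        raise ValueError("all three sequences must share one alignment length")
    if window <= 0:
        raise ValueError("window must be positive")
    calls = []
    for start in range(0, len(query), window):
        end = start + window
        ia = identity(query[start:end], parent_a[start:end])
        ib = identity(query[start:end], parent_b[start:end])
        calls.append("A" if ia > ib else "B" if ib > ia else "tie")
    return calls


def looks_chimeric(calls: List[str]) -> bool:
    """True when different windows of one LTR point to different parents."""
    return "A" in calls and "B" in calls


if __name__ == "__main__":
    a, b = "A" * 20, "C" * 20          # toy parental LTRs
    whole = "A" * 20                   # LTR resembling parent A throughout
    mixed = "A" * 10 + "C" * 10        # LTR switching parent midway
    print(closest_parent_by_window(whole, a, b, window=10))
    print(closest_parent_by_window(mixed, a, b, window=10))
```

In this toy setting, an LTR whose windows all point to the same parent matches the pattern reported for the SNARE recombinants, whereas a mix of parental calls within one LTR is the chimeric pattern expected under the genomic recombination model.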


V. Conclusion 

As more genomic sequences become available, many more nonautonomous retrotransposons will be identified and annotated. 
Because almost all eukaryotic genomes contain transposable elements, a better understanding of the structure, evolution, 
and replication of nonautonomous elements will help decipher their roles in host genome evolution. 
Future work is likely to focus on how, and at what frequency, nonautonomous elements interact and communicate 
with their partners, and on how they amplify in the genome, regulate gene expression, and drive host genome evolution. 